Abstract: We consider the k-encoder source coding problem 
with a quadratic distortion measure. We show that among all 
source distributions with a given covariance matrix K, the jointly 
Gaussian source requires the highest rates in order to meet a 
given set of distortion constraints. 

I. Introduction 

The characterization of the rate-distortion region for the k- 
encoder source coding problem, depicted in Figure 1, is one 
of the central open problems in network information theory. 
In this problem, k encoders observe different components of a 
random vector-valued source. Then, without cooperating, the 
encoders transmit messages over rate-constrained, noiseless 
channels to a central decoder, which, based on the k received 
messages, tries to reproduce the original source. The goal is 
to determine which rate tuples $(R_1, \ldots, R_k)$ allow the decoder 
to reproduce the source so that distortion constraints placed 
on each of the k components are satisfied. 

Most of the work on this problem has focused on the 
case k = 2, and, for some specific distortion constraints, 
the rate-distortion region has been completely characterized. 
When both sources must be reconstructed losslessly, we have 
the classical Slepian-Wolf problem [1]. When one of the 
two sources is available to the decoder as side-information, 
the rate-distortion region was characterized in [2-4] under 
different distortion constraints. The case where one of the 
sources must be reconstructed losslessly while the other must 
satisfy an arbitrary distortion constraint was solved by Berger 
and Yeung [5], and generalizes all the previous cases. 

In [6], the rate-distortion region for the two-encoder source 
coding problem with quadratic distortion constraints and Gaus- 
sian sources was completely characterized. A by-product of 
this result was the characterization of the Gaussian source as 
the worst-case source for the two-encoder quadratic source 
coding problem, generalizing the well known fact that the 
Gaussian source has the largest rate-distortion function for a 
given variance [7, Example 9.7]. 

The importance of characterizing the Gaussian source as 
the worst-case source is two-fold. First, it justifies the study 
of distributed source coding problems for Gaussian sources 
as a way of obtaining a worst-case analysis for more practical 
data source models. The second important aspect is to establish 
the existence of optimal codes for Gaussian sources which are 
robust to changes in the source distribution, i.e., they have the 
same performance guarantees if the sources are non-Gaussian. 



However, for the general k-encoder quadratic Gaussian 
source coding problem, it is still unknown whether the jointly 
Gaussian source is the worst-case source. The proof that the 
jointly Gaussian sources are the worst-case sources for the 
two-encoder problem in [6] follows from the fact that the 
Berger-Tung separation-based architecture [8, 9] is shown to 
be optimal for jointly Gaussian sources, and this architecture 
can achieve the same rate region for any source distribution 
with a given covariance matrix K. Since this separation-based 
architecture is not known to be optimal for the general k- 
encoder problem, the same arguments cannot be extended 
to the general case. Furthermore, it is in general unclear 
what kind of performance guarantees can be obtained when 
codes designed for the k-encoder source coding problem with 
Gaussian sources are employed with non-Gaussian sources. 
Therefore, in order to address these problems, new techniques 
must be introduced. 
Fig. 1. The k-encoder source coding problem. 



Recently, it was shown in [10] that the Gaussian noise is 
the worst-case noise for general multi-hop multi-flow wireless 
networks. The main idea was to apply an OFDM-like scheme 
at all transmitters and receivers in the network in order to 
"mix" different noise realizations over time. This mixing, if 
performed over sufficiently long blocks, allows the Central 
Limit Theorem to kick in, effectively creating a new network 
where the additive noises are approximately Gaussian. This 
allows a coding scheme designed for a wireless network with 
Gaussian noise terms to achieve the same rates of reliable 
communication on a network with non-Gaussian noises. 

In this paper, we show that similar ideas to the ones used 
in [10] can be used in the quadratic k-encoder source coding 
problem, if the source is not Gaussian. By having each encoder 
apply a DFT-based unitary linear transformation to its vector 
of source symbols, it is possible to create an approximately 



Gaussian source with the same covariance matrix. This allows 
us to prove that, for a given covariance matrix, the jointly 
Gaussian source is indeed the worst-case source for the k- 
encoder source coding problem. Moreover, this technique can 
be seen as a way of modifying codes designed for Gaussian 
sources so that they can be applied to non-Gaussian sources 
and still have a performance guarantee. 

II. Problem Setup and Main Result 

We consider the k-encoder rate-distortion problem with a quadratic distortion measure. In this problem, k encoders observe different components of a vector-valued i.i.d. sequence $\{(x_1[i], \ldots, x_k[i])\}_{i=0}^{\infty}$. We assume that $(x_1[0], \ldots, x_k[0])$ has an arbitrary distribution with zero mean and covariance matrix K. Encoder m maps $\mathbf{x}_m = (x_m[0], \ldots, x_m[n-1])$ to an integer $f_m(\mathbf{x}_m) \in \{1, \ldots, 2^{nR_m}\}$, which is transmitted noiselessly to a central decoder. Given the k integers $f_m(\mathbf{x}_m)$, $m = 1, \ldots, k$, the decoder uses decoding functions $g_1, \ldots, g_k$ in order to obtain estimates $(\hat{x}_m[0], \ldots, \hat{x}_m[n-1]) = g_m(f_1(\mathbf{x}_1), \ldots, f_k(\mathbf{x}_k))$, for $m = 1, \ldots, k$. A code for the k-encoder rate-distortion problem is comprised of a set of encoding and decoding functions $(f_1, \ldots, f_k, g_1, \ldots, g_k)$ for a given blocklength n. 

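Definition 1. Rate-distortion vector $(R_1, \ldots, R_k, D_1, \ldots, D_k)$ is achievable if, for some blocklength n, there exists a code $(f_1, \ldots, f_k, g_1, \ldots, g_k)$ for which 

$$E\left\|\mathbf{x}_m - g_m(f_1(\mathbf{x}_1), \ldots, f_k(\mathbf{x}_k))\right\|^2 \le D_m n, \qquad (1)$$

for $m = 1, \ldots, k$. 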
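As a concrete reading of the achievability condition (1), the following minimal sketch evaluates the expected squared error of a toy code exactly, by enumerating a finite source distribution. Everything in it is a hypothetical illustration rather than part of the paper: the blocklength is fixed to n = 1, the two-component binary source `TOY_PMF`, the encoders, and the decoders are made up, and the helper names `expected_distortions` and `meets_constraints` are assumptions.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Source = Tuple[float, ...]

def expected_distortions(
    pmf: Dict[Source, float],
    encoders: Sequence[Callable[[float], int]],
    decoders: Sequence[Callable[..., float]],
) -> List[float]:
    """Return E|x_m - g_m(f_1(x_1), ..., f_k(x_k))|^2 for each m (blocklength 1)."""
    totals = [0.0] * len(decoders)
    for xs, prob in pmf.items():
        # Each encoder sees only its own component and emits an integer message.
        messages = [f(x) for f, x in zip(encoders, xs)]
        for m, g in enumerate(decoders):
            totals[m] += prob * (xs[m] - g(*messages)) ** 2
    return totals

def meets_constraints(pmf, encoders, decoders, targets: Sequence[float]) -> bool:
    """Check condition (1) with n = 1, i.e. distortion_m <= D_m for every m."""
    dists = expected_distortions(pmf, encoders, decoders)
    return all(d <= D + 1e-12 for d, D in zip(dists, targets))

# Hypothetical toy source: two correlated +/-1 components (zero mean, unit variance).
TOY_PMF = {(1.0, 1.0): 0.4, (-1.0, -1.0): 0.4, (1.0, -1.0): 0.1, (-1.0, 1.0): 0.1}
# Encoder 1 sends the sign (rate 1); encoder 2 sends a constant (rate 0).
TOY_ENCODERS = [lambda x: 2 if x > 0 else 1, lambda x: 1]
# Decoder 1 recovers x_1 exactly; decoder 2 outputs E[x_2 | x_1] = 0.6 * x_1.
TOY_DECODERS = [
    lambda c1, c2: 1.0 if c1 == 2 else -1.0,
    lambda c1, c2: 0.6 if c1 == 2 else -0.6,
]
```

In this toy setting the second component is never encoded, yet the decoder still meets a distortion target of 0.64 for it by exploiting the correlation with the first component.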



The following result establishes that the jointly Gaussian 
distribution is the worst-case source distribution among those 
with covariance matrix K. 

Theorem 1. If rate-distortion vector $(R_1, \ldots, R_k, D_1, \ldots, D_k)$ is achievable when $(x_1[0], \ldots, x_k[0])$ is jointly Gaussian with covariance matrix K, then, for any $\epsilon > 0$, rate-distortion vector $(R_1 + \epsilon, \ldots, R_k + \epsilon, D_1 + \epsilon, \ldots, D_k + \epsilon)$ is achievable when $(x_1[0], \ldots, x_k[0])$ has any arbitrary distribution with covariance matrix K. 

III. Proof of Main Result 

In order to prove Theorem 1, we will need the following 
lemma, whose proof is in the Appendix. 

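Lemma 1. Assume $(x_1[0], \ldots, x_k[0])$ is jointly Gaussian. For any code $(f_1, \ldots, f_k, g_1, \ldots, g_k)$ that achieves rate-distortion vector $(R_1, \ldots, R_k, D_1, \ldots, D_k)$ and any $\epsilon, \epsilon' > 0$, one can find another code $(\tilde{f}_1, \ldots, \tilde{f}_k, \tilde{g}_1, \ldots, \tilde{g}_k)$ that achieves the rate-distortion vector $(R_1 + \epsilon, \ldots, R_k + \epsilon, D_1 + \epsilon', \ldots, D_k + \epsilon')$ and for which the set of discontinuities of each $\tilde{f}_m$, $m = 1, \ldots, k$, has Lebesgue measure zero. 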

Proof of Theorem 1: Suppose the rate-distortion vector $(R_1, \ldots, R_k, D_1, \ldots, D_k)$ is achievable in the case where $(x_1[0], \ldots, x_k[0])$ is jointly Gaussian with covariance matrix K. Fix $\epsilon > 0$. From Lemma 1, we can assume that we have a code $(f_1, \ldots, f_k, g_1, \ldots, g_k)$ with blocklength n, which achieves rate-distortion vector $(R_1 + \epsilon, \ldots, R_k + \epsilon, D_1 + \epsilon/2, \ldots, D_k + \epsilon/2)$ if $(x_1[0], \ldots, x_k[0])$ is jointly Gaussian, and such that the set of discontinuities of each $f_m$, $m = 1, \ldots, k$, has Lebesgue measure zero. We will then construct new encoding functions $\tilde{f}_1, \ldots, \tilde{f}_k$ with blocklength nb, for a large integer b, where $\tilde{f}_m$ is applied to the source sequence $\mathbf{x}_m = (x_m[0], \ldots, x_m[nb-1])$, for $m = 1, \ldots, k$. The construction of these new encoding functions is illustrated in Figure 2. Encoder m starts by applying a unitary (2-norm preserving) linear transformation Q (defined later) to each block of length b. The n resulting blocks of length b are then interleaved, generating b length-n vectors $\tilde{\mathbf{x}}_m^{(0)}, \ldots, \tilde{\mathbf{x}}_m^{(b-1)}$, as shown in Figure 2. The original encoding function $f_m$ (which takes as input a length-n vector) is then individually applied to each $\tilde{\mathbf{x}}_m^{(\ell)}$, for $\ell = 0, \ldots, b-1$. This generates b integers in $\{1, \ldots, 2^{n(R_m+\epsilon)}\}$, which can then be combined into a single integer from $\{1, \ldots, 2^{nb(R_m+\epsilon)}\}$ to produce the encoder output $\tilde{f}_m(\mathbf{x}_m)$. 

Fig. 2. Illustration of the new encoding procedure for encoder m. 

At the decoder side, each $\tilde{f}_m(\mathbf{x}_m)$, for $m = 1, \ldots, k$, is first broken into the b original integers from $\{1, \ldots, 2^{n(R_m+\epsilon)}\}$. Then, using the original decoding function $g_m$, the decoder obtains estimates of $\tilde{\mathbf{x}}_m^{(\ell)}$, for $\ell = 0, \ldots, b-1$, which can then be converted to an estimate of $\mathbf{x}_m$ by applying $Q^{-1}$ n times. This defines the new decoding functions $\tilde{g}_m$, $m = 1, \ldots, k$. 
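The bookkeeping in this construction (block-wise transform, interleaving into b length-n vectors, and merging b message indices into one) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the transform is passed in as any real orthogonal matrix `Q` given as a list of rows (so its inverse is its transpose), and message indices are zero-based, whereas the text uses the range starting at 1.

```python
from typing import List, Sequence

Matrix = List[List[float]]

def matvec(Q: Matrix, v: Sequence[float]) -> List[float]:
    """Multiply the matrix Q (list of rows) by the vector v."""
    return [sum(q * x for q, x in zip(row, v)) for row in Q]

def transform_and_interleave(x: Sequence[float], n: int, b: int, Q: Matrix) -> List[List[float]]:
    """Split x (length n*b) into n blocks of length b, apply Q to each block,
    and return b vectors of length n; vector l collects entry l of every block."""
    assert len(x) == n * b
    blocks = [matvec(Q, x[i * b:(i + 1) * b]) for i in range(n)]
    return [[blocks[i][l] for i in range(n)] for l in range(b)]

def deinterleave_and_invert(vectors: Sequence[Sequence[float]], n: int, b: int, Q: Matrix) -> List[float]:
    """Undo the interleaving and apply Q^T (the inverse of an orthogonal Q) n times."""
    Qt = [list(col) for col in zip(*Q)]
    out: List[float] = []
    for i in range(n):
        out.extend(matvec(Qt, [vectors[l][i] for l in range(b)]))
    return out

def pack_indices(indices: Sequence[int], base: int) -> int:
    """Merge len(indices) integers in {0, ..., base-1} into one integer (base-`base` digits)."""
    packed = 0
    for idx in indices:
        assert 0 <= idx < base
        packed = packed * base + idx
    return packed

def unpack_indices(packed: int, base: int, count: int) -> List[int]:
    """Inverse of pack_indices: recover the `count` original integers in order."""
    digits = []
    for _ in range(count):
        packed, r = divmod(packed, base)
        digits.append(r)
    return digits[::-1]
```

With `base` playing the role of $2^{n(R_m+\epsilon)}$, packing b indices yields an integer below `base ** b`, which corresponds to the alphabet size $2^{nb(R_m+\epsilon)}$ of the new encoder.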

We define the unitary matrix Q by having the entry in the (i + 1)th row and (j + 1)th column be 

$$Q(i,j) = \begin{cases} 1/\sqrt{b} & \text{if } i = 0 \\ \sqrt{2/b}\,\cos\left(2\pi i j / b\right) & \text{if } i = 1, \ldots, \frac{b}{2} - 1 \\ (-1)^j/\sqrt{b} & \text{if } i = \frac{b}{2} \\ \sqrt{2/b}\,\sin\left(2\pi (i - b/2) j / b\right) & \text{if } i = \frac{b}{2} + 1, \ldots, b - 1 \end{cases}$$

for $i, j \in \{0, \ldots, b-1\}$. We point out that applying the linear transformation Q to a vector x can be seen as first taking the DFT of x, then separating the real and imaginary parts of the resulting vector, and renormalizing them so that the resulting transformation is unitary. Checking that Q is a unitary transformation, i.e., that $\|Q\mathbf{x}\| = \|\mathbf{x}\|$ for any $\mathbf{x} \in \mathbb{R}^b$, is straightforward and thus omitted. 
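The omitted check can also be done numerically. The sketch below builds Q for an even block length b and measures how far $QQ^T$ is from the identity. It assumes the usual real-DFT layout of the rows (one constant row, cosine rows, one alternating-sign row, sine rows); the function names `build_q` and `max_orthonormality_error` are hypothetical.

```python
import math
from typing import List

def build_q(b: int) -> List[List[float]]:
    """Real DFT-based orthogonal matrix Q (rows indexed by i, columns by j), even b only."""
    if b < 2 or b % 2 != 0:
        raise ValueError("this sketch assumes an even block length b >= 2")
    half = b // 2
    Q = []
    for i in range(b):
        if i == 0:
            row = [1.0 / math.sqrt(b)] * b                      # constant (DC) row
        elif i < half:
            row = [math.sqrt(2.0 / b) * math.cos(2 * math.pi * i * j / b)
                   for j in range(b)]                           # cosine rows
        elif i == half:
            row = [(-1.0) ** j / math.sqrt(b) for j in range(b)]  # alternating-sign row
        else:
            row = [math.sqrt(2.0 / b) * math.sin(2 * math.pi * (i - half) * j / b)
                   for j in range(b)]                           # sine rows
        Q.append(row)
    return Q

def max_orthonormality_error(Q: List[List[float]]) -> float:
    """Largest absolute entry of Q Q^T - I; close to zero when the rows are orthonormal."""
    b = len(Q)
    worst = 0.0
    for r in range(b):
        for s in range(b):
            dot = sum(Q[r][j] * Q[s][j] for j in range(b))
            worst = max(worst, abs(dot - (1.0 if r == s else 0.0)))
    return worst
```

Two properties of this matrix are used later in the proof: every row has unit squared norm, and every entry is at most $\sqrt{2/b}$ in absolute value.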



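Our next goal is to show that, by choosing b large enough, we can make the distortion of this new code arbitrarily close to the distortion of the original code applied to the Gaussian source. We start by noticing that, since Q is a unitary linear transformation, the distortion of our new code can be written in terms of $\tilde{\mathbf{x}}_m^{(\ell)}$, for $\ell = 0, \ldots, b-1$, as 

$$E\left\|\mathbf{x}_m - \tilde{g}_m(\tilde{f}_1(\mathbf{x}_1), \ldots, \tilde{f}_k(\mathbf{x}_k))\right\|^2 = \sum_{\ell=0}^{b-1} E\left\|\tilde{\mathbf{x}}_m^{(\ell)} - g_m(f_1(\tilde{\mathbf{x}}_1^{(\ell)}), \ldots, f_k(\tilde{\mathbf{x}}_k^{(\ell)}))\right\|^2.$$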

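For each $b = 1, 2, \ldots$, we will let 

$$\ell_b = \arg\max_{0 \le \ell \le b-1} E\left\|\tilde{\mathbf{x}}_m^{(\ell)} - g_m(f_1(\tilde{\mathbf{x}}_1^{(\ell)}), \ldots, f_k(\tilde{\mathbf{x}}_k^{(\ell)}))\right\|^2,$$

i.e., the $\ell_b$th length-n block has the largest expected distortion. 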
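Note that $\left\{\left(\tilde{x}_1^{(\ell_b)}[i], \ldots, \tilde{x}_k^{(\ell_b)}[i]\right)\right\}_{i=0}^{n-1}$ is an i.i.d. sequence of length-k random vectors. We will show that it converges in distribution to a sequence of i.i.d. jointly Gaussian random vectors with covariance matrix K, as $b \to \infty$. Clearly, it suffices to show that $\left(\tilde{x}_1^{(\ell_b)}[0], \ldots, \tilde{x}_k^{(\ell_b)}[0]\right)$ converges in distribution to a jointly Gaussian random vector with covariance matrix K, as $b \to \infty$. In order to use the Cramér-Wold Theorem, we fix an arbitrary vector $(t_1, \ldots, t_k) \in \mathbb{R}^k$ and we notice that 

$$\sum_{m=1}^{k} t_m \tilde{x}_m^{(\ell_b)}[0] = \sum_{m=1}^{k} t_m \sum_{j=0}^{b-1} x_m[j]\, Q(\ell_b, j) = \sum_{j=0}^{b-1} \left(\sum_{m=1}^{k} t_m x_m[j]\right) Q(\ell_b, j). \qquad (2)$$
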

To characterize the convergence in distribution of (2), we will 
need the following result. 

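Theorem 2 (Lindeberg's Central Limit Theorem [11]). Suppose that for each $b = 1, 2, \ldots$, the random variables $Y_{b,1}, Y_{b,2}, \ldots, Y_{b,b}$ are independent. In addition, suppose that, for all b and $i \le b$, $E[Y_{b,i}] = 0$, and let 

$$s_b^2 = \sum_{i=1}^{b} E\left[Y_{b,i}^2\right]. \qquad (3)$$

Then, if for all $\epsilon > 0$, Lindeberg's condition 

$$\frac{1}{s_b^2} \sum_{i=1}^{b} E\left(Y_{b,i}^2\, \mathbf{1}\{|Y_{b,i}| > \epsilon s_b\}\right) \to 0 \text{ as } b \to \infty \qquad (4)$$

holds, we have that 

$$\frac{\sum_{i=1}^{b} Y_{b,i}}{s_b} \xrightarrow{d} \mathcal{N}(0,1).$$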



To apply Lindeberg's CLT, we will let, for $j = 0, \ldots, b-1$, 

$$Y_{b,j+1} = \sqrt{b} \left(\sum_{m=1}^{k} t_m x_m[j]\right) Q(\ell_b, j).$$



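Then, if we let $K_{u,v}$ be the entry in the uth row and vth column of K, we have 

$$s_b^2 = \sum_{j=1}^{b} E\left[Y_{b,j}^2\right] = b \sum_{j=1}^{b} Q^2(\ell_b, j-1)\, E\left(\sum_{m=1}^{k} t_m x_m[j-1]\right)^2 = b \sum_{1 \le u,v \le k} t_u t_v K_{u,v} \sum_{j=1}^{b} Q^2(\ell_b, j-1) = b \sum_{1 \le u,v \le k} t_u t_v K_{u,v},$$

regardless of the value of $\ell_b$. In order to verify Lindeberg's condition, we define $\sigma^2 = \sum_{1 \le u,v \le k} t_u t_v K_{u,v}$ and we let $U_{b,j} = Y_{b,j}^2\, \mathbf{1}\{|Y_{b,j}| > \epsilon s_b\} = Y_{b,j}^2\, \mathbf{1}\{|Y_{b,j}| > \epsilon \sigma \sqrt{b}\}$. Consider any sequence $j_b$, for $b = 1, 2, \ldots$, such that $j_b \in \{1, \ldots, b\}$, and any $\delta > 0$. Then we have that 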

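$$\Pr\left(U_{b,j_b} < \delta\right) \ge \Pr\left(|Y_{b,j_b}| < \epsilon \sigma \sqrt{b}\right) \ge \Pr\left(\left|\sum_{m=1}^{k} t_m x_m[j_b - 1]\right| \sqrt{2} < \epsilon \sigma \sqrt{b}\right) = \Pr\left(\left|\sum_{m=1}^{k} t_m x_m[0]\right| < \epsilon \sigma \sqrt{b/2}\right) \to 1,$$

as $b \to \infty$, which means that $U_{b,j_b} \xrightarrow{p} 0$ as $b \to \infty$. Moreover, we have that 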



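$$\Pr\left(|U_{b,j_b}| > t\right) \le \Pr\left(2\left(\sum_{m=1}^{k} t_m x_m[j_b - 1]\right)^2 > t\right) = \Pr\left(2\left(\sum_{m=1}^{k} t_m x_m[0]\right)^2 > t\right)$$

for $t > 0$, and 

$$E\left[2\left(\sum_{m=1}^{k} t_m x_m[0]\right)^2\right] = 2\sigma^2 < \infty,$$

and by the Dominated Convergence Theorem [11, pages 338-339], we have that $E[U_{b,j_b}] \to 0$ as $b \to \infty$. We conclude that 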

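$$\frac{1}{s_b^2} \sum_{i=1}^{b} E\left(Y_{b,i}^2\, \mathbf{1}\{|Y_{b,i}| > \epsilon s_b\}\right) = \frac{1}{b\sigma^2} \sum_{i=1}^{b} E[U_{b,i}] \le \frac{1}{\sigma^2} \max_{1 \le j \le b} E[U_{b,j}] \to 0,$$

as $b \to \infty$, and Lindeberg's condition (4) is satisfied for any $\epsilon > 0$. Hence, from Theorem 2, we have that 

$$\frac{\sum_{j=1}^{b} Y_{b,j}}{\sigma \sqrt{b}} \xrightarrow{d} \mathcal{N}(0,1),$$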



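which implies, from (2), that 

$$\sum_{m=1}^{k} t_m \tilde{x}_m^{(\ell_b)}[0] = \sum_{j=0}^{b-1} \left(\sum_{m=1}^{k} t_m x_m[j]\right) Q(\ell_b, j) \xrightarrow{d} \mathcal{N}(0, \sigma^2).$$
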

Finally, since for a jointly Gaussian vector $(y_1, \ldots, y_k)$ with mean zero and covariance matrix K, we have $\sum_{m=1}^{k} t_m y_m \sim \mathcal{N}(0, \sigma^2)$, we conclude, from the Cramér-Wold Theorem, that $\left(\tilde{x}_1^{(\ell_b)}[0], \ldots, \tilde{x}_k^{(\ell_b)}[0]\right)$ converges in distribution to a jointly Gaussian random vector with zero mean and covariance matrix K, as $b \to \infty$. 
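A quick moment-based sanity check of this Gaussianization (not a substitute for the proof) uses the fact that the fourth cumulant is additive over independent variables: for an i.i.d. scalar source with excess kurtosis $\kappa$ and a unit-norm weight vector w, the weighted sum has excess kurtosis $\kappa \sum_j w_j^4$. The sketch below evaluates this for the rows of Q, assuming the real-DFT layout with even b; the helper names are hypothetical, and a vanishing excess kurtosis is only a necessary signature of convergence to a Gaussian.

```python
import math
from typing import List

def q_row(b: int, i: int) -> List[float]:
    """Row i of the real DFT-based orthogonal matrix Q, for even b."""
    if b < 2 or b % 2 != 0 or not 0 <= i < b:
        raise ValueError("expects even b >= 2 and 0 <= i < b")
    half = b // 2
    if i == 0:
        return [1.0 / math.sqrt(b)] * b
    if i < half:
        return [math.sqrt(2.0 / b) * math.cos(2 * math.pi * i * j / b) for j in range(b)]
    if i == half:
        return [(-1.0) ** j / math.sqrt(b) for j in range(b)]
    return [math.sqrt(2.0 / b) * math.sin(2 * math.pi * (i - half) * j / b) for j in range(b)]

def fourth_power_sum(b: int, i: int) -> float:
    """Sum_j Q(i, j)^4, the factor by which row i shrinks the excess kurtosis."""
    return sum(q ** 4 for q in q_row(b, i))

def worst_excess_kurtosis(source_excess_kurtosis: float, b: int) -> float:
    """Largest |excess kurtosis| over all b transformed coordinates."""
    return abs(source_excess_kurtosis) * max(fourth_power_sum(b, i) for i in range(b))
```

For example, a uniform source has excess kurtosis -6/5, and `worst_excess_kurtosis(-1.2, b)` stays below $1.2 \cdot 2/b$, so the deviation from Gaussian kurtosis decays like 1/b for every choice of row, including the worst-case row $\ell_b$.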

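Now, since the set of discontinuities of $f_m$, for $m = 1, \ldots, k$, has Lebesgue measure zero, it is easy to see that the mapping 

$$(\mathbf{x}_1, \ldots, \mathbf{x}_k) \mapsto \left\|\mathbf{x}_m - g_m(f_1(\mathbf{x}_1), \ldots, f_k(\mathbf{x}_k))\right\|^2,$$

for $m = 1, \ldots, k$, must also have a set of discontinuities with Lebesgue measure zero. We conclude that 

$$\left\|\tilde{\mathbf{x}}_m^{(\ell_b)} - g_m(f_1(\tilde{\mathbf{x}}_1^{(\ell_b)}), \ldots, f_k(\tilde{\mathbf{x}}_k^{(\ell_b)}))\right\|^2 \xrightarrow{d} \left\|\mathbf{y}_m - g_m(f_1(\mathbf{y}_1), \ldots, f_k(\mathbf{y}_k))\right\|^2$$

as $b \to \infty$, where $\mathbf{y}_m = (y_m[0], \ldots, y_m[n-1])$, for $m = 1, \ldots, k$, and $\{(y_1[i], \ldots, y_k[i])\}_{i=0}^{n-1}$ is an i.i.d. sequence such that $(y_1[0], \ldots, y_k[0])$ is jointly Gaussian with zero mean and covariance matrix K. Moreover, we have that 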

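$$\left\|\tilde{\mathbf{x}}_m^{(\ell_b)} - g_m(f_1(\tilde{\mathbf{x}}_1^{(\ell_b)}), \ldots, f_k(\tilde{\mathbf{x}}_k^{(\ell_b)}))\right\|^2 \le 2\left\|\tilde{\mathbf{x}}_m^{(\ell_b)}\right\|^2 + 2\left\|g_m(f_1(\tilde{\mathbf{x}}_1^{(\ell_b)}), \ldots, f_k(\tilde{\mathbf{x}}_k^{(\ell_b)}))\right\|^2 \le 2\left\|\tilde{\mathbf{x}}_m^{(\ell_b)}\right\|^2 + 2 \max_{c_1, \ldots, c_k} \left\|g_m(c_1, \ldots, c_k)\right\|^2,$$

and also that 

$$E\left\|\tilde{\mathbf{x}}_m^{(\ell_b)}\right\|^2 = n E\left(\sum_{j=0}^{b-1} x_m[j]\, Q(\ell_b, j)\right)^2 = n K_{m,m} \sum_{j=0}^{b-1} Q^2(\ell_b, j) = n K_{m,m} < \infty.$$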

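Thus, from a variation of the Dominated Convergence Theorem (see Problem 16.4 in [11]), we conclude that, as $b \to \infty$, 

$$E\left\|\tilde{\mathbf{x}}_m^{(\ell_b)} - g_m(f_1(\tilde{\mathbf{x}}_1^{(\ell_b)}), \ldots, f_k(\tilde{\mathbf{x}}_k^{(\ell_b)}))\right\|^2 \to E\left\|\mathbf{y}_m - g_m(f_1(\mathbf{y}_1), \ldots, f_k(\mathbf{y}_k))\right\|^2 \le n(D_m + \epsilon/2).$$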
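Therefore, we can choose b sufficiently large so that 

$$\frac{1}{n} E\left\|\tilde{\mathbf{x}}_m^{(\ell_b)} - g_m(f_1(\tilde{\mathbf{x}}_1^{(\ell_b)}), \ldots, f_k(\tilde{\mathbf{x}}_k^{(\ell_b)}))\right\|^2 \le \frac{1}{n} E\left\|\mathbf{y}_m - g_m(f_1(\mathbf{y}_1), \ldots, f_k(\mathbf{y}_k))\right\|^2 + \epsilon/2 \le D_m + \epsilon.$$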



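The expected distortion of our code $(\tilde{f}_1, \ldots, \tilde{f}_k, \tilde{g}_1, \ldots, \tilde{g}_k)$ (with blocklength nb) thus satisfies 

$$\frac{1}{nb} \sum_{\ell=0}^{b-1} E\left\|\tilde{\mathbf{x}}_m^{(\ell)} - g_m(f_1(\tilde{\mathbf{x}}_1^{(\ell)}), \ldots, f_k(\tilde{\mathbf{x}}_k^{(\ell)}))\right\|^2 \le \frac{1}{n} E\left\|\tilde{\mathbf{x}}_m^{(\ell_b)} - g_m(f_1(\tilde{\mathbf{x}}_1^{(\ell_b)}), \ldots, f_k(\tilde{\mathbf{x}}_k^{(\ell_b)}))\right\|^2 \le D_m + \epsilon,$$

for $m = 1, \ldots, k$. This concludes the proof of Theorem 1. ■ 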
Appendix 

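Proof of Lemma 1: Let $(x_1[0], \ldots, x_k[0])$ be jointly Gaussian. If we assume that the rate-distortion vector $(R_1, \ldots, R_k, D_1, \ldots, D_k)$ is achievable, then, for some blocklength n, there exists a code $(f_1, \ldots, f_k, g_1, \ldots, g_k)$ for which (1) is satisfied for $m = 1, \ldots, k$. We follow the construction from [12] to build a code $(\tilde{f}_1, \ldots, \tilde{f}_k, \tilde{g}_1, \ldots, \tilde{g}_k)$ with the same blocklength n, which satisfies 

$$E\left\|\mathbf{x}_m - \tilde{g}_m(\tilde{f}_1(\mathbf{x}_1), \ldots, \tilde{f}_k(\mathbf{x}_k))\right\|^2 \le D_m n + \epsilon'$$

for $m = 1, \ldots, k$. 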

Since our code can be repeated over multiple blocks of length n, we may assume that n is large enough so that $2^{nR_m} + 1 \le 2^{n(R_m+\epsilon)}$ for each m. Focus on encoder $f_1$, and let $B_j = f_1^{-1}(j)$, for $j \in \{1, \ldots, 2^{nR_1}\}$. Then, the $B_j$'s are a partition of $\mathbb{R}^n$. For each j, from Theorem 11.4 in [11], for any $\delta > 0$, there exists a countable (in fact, finite) union of disjoint bounded rectangles $\hat{B}_j$ such that $\Pr[\mathbf{x}_1 \in B_j \,\triangle\, \hat{B}_j] < \delta$. Then we define $\tilde{f}_1$ as 


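$$\tilde{f}_1(\mathbf{x}_1) = \begin{cases} j & \text{if } \mathbf{x}_1 \in \hat{B}_j \setminus \bigcup_{i \ne j} \hat{B}_i \\ 0 & \text{otherwise.} \end{cases}$$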



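We create the encoders $\tilde{f}_2, \ldots, \tilde{f}_k$ in the same way. For $m = 1, \ldots, k$, our decoders will be 

$$\tilde{g}_m(j_1, \ldots, j_k) = \begin{cases} g_m(j_1, \ldots, j_k) & \text{if } j_i \ne 0 \text{ for } i = 1, \ldots, k \\ \mathbf{0} & \text{otherwise.} \end{cases}$$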



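The new code is similar to the original one in the sense that, if we let A be the event 

$$\left\{ g_m(f_1(\mathbf{x}_1), \ldots, f_k(\mathbf{x}_k)) \ne \tilde{g}_m(\tilde{f}_1(\mathbf{x}_1), \ldots, \tilde{f}_k(\mathbf{x}_k)) \text{ for some } m \in \{1, \ldots, k\} \right\},$$

then, by the union bound, 

$$\Pr[A] \le \delta \sum_{m=1}^{k} 2^{nR_m}.$$

It is clear that this new code has rates at most $R_1 + \epsilon, \ldots, R_k + \epsilon$. Following the derivation in [12], the distortion for decoder $\tilde{g}_1$ satisfies 

$$E\left\|\mathbf{x}_1 - \tilde{g}_1(\tilde{f}_1(\mathbf{x}_1), \ldots, \tilde{f}_k(\mathbf{x}_k))\right\|^2 \le D_1 n + M\sqrt{\delta},$$

where M is a finite number, independent of $\delta$. Therefore, we can choose $\delta > 0$ sufficiently small so that $M\sqrt{\delta} < n\epsilon'$, and the distortion of each decoder $\tilde{g}_m$ is at most $D_m + \epsilon'$. 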
Finally, we need to show that the set of discontinuities of each $\tilde{f}_m$ has measure zero. If we again focus on $\tilde{f}_1$, this function partitions $\mathbb{R}^n$ into $\tilde{B}_j = \hat{B}_j \setminus \bigcup_{i \ne j} \hat{B}_i$ for $j = 1, \ldots, 2^{nR_1}$ and $\tilde{B}_0 = \mathbb{R}^n \setminus \bigcup_j \tilde{B}_j$. Moreover, since the $\hat{B}_j$'s were countable unions of disjoint bounded rectangles, and the class of bounded rectangles forms a semiring [11], the $\tilde{B}_j$'s are also countable unions of disjoint bounded rectangles. Therefore, for a given j, we can write $\tilde{B}_j = \bigcup_i S_i$, where the $S_i$'s are disjoint bounded rectangles. Moreover, we can also write $\tilde{B}_j^c = \bigcup_i T_i$, where the $T_i$'s are disjoint bounded rectangles. Thus, we have 

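$$\begin{aligned} \partial \tilde{B}_j = \partial\left(\cup_i S_i\right) &= \mathbb{R}^n - \left(\cup_i S_i\right)^\circ - \left(\cup_i T_i\right)^\circ \\ &= \left(\cup_i S_i\right) \cup \left(\cup_i T_i\right) - \left(\cup_i S_i\right)^\circ - \left(\cup_i T_i\right)^\circ \\ &\subseteq \left(\cup_i S_i\right) \cup \left(\cup_i T_i\right) - \left(\cup_i S_i^\circ\right) - \left(\cup_i T_i^\circ\right) \\ &= \left(\cup_i \left(S_i - S_i^\circ\right)\right) \cup \left(\cup_i \left(T_i - T_i^\circ\right)\right) \\ &\subseteq \left(\cup_i \partial S_i\right) \cup \left(\cup_i \partial T_i\right). \end{aligned}$$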

Since the boundary of a bounded rectangle clearly has Lebesgue measure zero, we have, for each i, $\lambda(\partial S_i) = \lambda(\partial T_i) = 0$, and we conclude that 

$$\lambda(\partial \tilde{B}_j) \le \sum_i \lambda(\partial S_i) + \sum_i \lambda(\partial T_i) = 0,$$

implying that the boundary of the partition of $\mathbb{R}^n$ induced by $\tilde{f}_m$ has Lebesgue measure zero. ■ 